Title
Date Reviewed
Review
The Dataset Collection
Jul 22, 2015
data
eye 50,498
favorite 8
comment 3

Find the dataset available for instant analysis in BigQuery and queries on this reddit...

(Here is the original Reddit comment announcing this collection of data and what the processes were.) This is an archive of Reddit comments from October 2007 through May 2015 (complete month). It reflects 14 months of work and a lot of API calls. The dataset includes nearly every publicly available Reddit comment. Approximately 350,000 comments out of ~1.65 billion were unavailable due to Reddit API issues.
Q: How are the files structured? Each file is compressed with bzip2 compression....
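A minimal sketch of reading one of the bzip2-compressed files, assuming (the description is truncated, so this is not confirmed by the listing) that each file holds one JSON object per line. The function name `iter_comments` and the file path passed to it are hypothetical.

```python
import bz2
import json


def iter_comments(path):
    """Yield one dict per non-empty line of a bzip2-compressed file.

    Assumption: the file is newline-delimited JSON (one record per line).
    """
    # bz2.open in text mode decompresses on the fly, so the whole
    # archive never has to fit in memory.
    with bz2.open(path, mode="rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                yield json.loads(line)


def count_comments(path):
    """Count the records in one compressed file by streaming through it."""
    return sum(1 for _ in iter_comments(path))
```

Streaming line by line keeps memory use flat regardless of file size; `count_comments` is only an illustration of consuming the generator.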

Community Texts
software
eye 908
favorite 2
comment 0

Archive of yet another batch of images that were on 4chan. Found this file on MEGA.
Topics: mega, mega.co.nz, 4chan, images, archive, /b/

Community Software
Sep 17, 2016
software
eye 1,591
favorite 2
comment 1

WGET from yoloselfie.com open images directory. 12,615 images @ 3.305GB | 30 March 2016
Topics: yoloselfie.com, images, porn, xxx, yolo, selfie, pictures, ohhdemgirls, nudes

Community Video
movies
eye 8,412
favorite 2
comment 0

zenguy_pc gonewilder archive from the last 500 days or so /// 491GB /// 20,000+ users from r/gonewild and related subs, user folders tarred to minimize file count, database and logs are included but I don't recommend using the database for anything other than study of the contained data, help me archive this and have fun :D Seeding original torrent file until 01/01/2017 magnet:?xt=urn:btih:52e2f1f19f819b741be7fe1c5e84df544aebd722&dn=zenguys_gonewilder_NOV2016 Tool...
Topics: zenguy, ohhdemgirls, gonewild, gonwilder
Source: torrent:urn:sha1:52e2f1f19f819b741be7fe1c5e84df544aebd722

Community Software
software
eye 2,065
favorite 2
comment 0

Selfie tumblr blogs march 2015
Output
./ - the current directory
    / - your blog backup
        index.html - table of contents with links to the monthly pages
        backup.css - the default backup style sheet
        archive/
            .html - the monthly pages
            …
        posts/
            .html - the single post pages
            …
        images/
            - the image files
            …
        xml/    ...
Topics: tumblr, selfies, blogs100selfshot, 1selfishoti, agselfshot, appleselfshot, asianteengf, assselfie,...

The Dataset Collection
data
eye 4,476
favorite 2
comment 0

I took the Reddit comment archive and converted all the JSON into one SQLite database using this program that I wrote: https://gist.github.com/ers35/3b615a75fa0ed5e6d5cc I ran a few tests to make sure the number of database rows matches the number of JSON records. "SELECT MAX(rowid) FROM comment" and "SELECT COUNT(id) FROM comment" both return 1659361605. This gives me some confidence as to the integrity of the dataset, but I cannot be 100% sure. The compressed size is 163G....
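The row-count check described above can be sketched with Python's built-in `sqlite3` module. The table name `comment` and column `id` come from the queries quoted in the description; the helper names and everything else about the schema are assumptions made for illustration.

```python
import sqlite3


def integrity_counts(conn):
    """Run the two queries from the description and return both results."""
    max_rowid = conn.execute("SELECT MAX(rowid) FROM comment").fetchone()[0]
    row_count = conn.execute("SELECT COUNT(id) FROM comment").fetchone()[0]
    return max_rowid, row_count


def counts_agree(conn, expected):
    """True when MAX(rowid), COUNT(id), and the expected record count match.

    If rows were skipped or deleted, MAX(rowid) and COUNT(id) can diverge,
    which is what makes comparing them a cheap sanity check.
    """
    max_rowid, row_count = integrity_counts(conn)
    return max_rowid == row_count == expected
```

For the database described here, `expected` would be 1659361605, with `conn` opened on the decompressed database file. As the description notes, agreement raises confidence in the conversion but does not prove that every record is intact.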

Reddit Comment and Post JSON 2007-2013(06) Downloaded 2013(06)
Topics: reddit, data, comments, posts, history, json, social, media, cats, porn

Community Data
texts
eye 13,138
favorite 6
comment 0

Gonewild Data 2009 - 2013(6)
This release includes imgur-hosted single images (live as of Apr 2nd 2014); for albums, look out for the next release. Dead links were removed, but I didn't account for junk images or reclaimed URLs.
# 2009 PNG: 25 JPG: 829 GIF: 6
# 2010 PNG: 260 JPG: 6666 GIF: 21
# 2011 PNG: 769 JPG: 21408 GIF: 132
# 2012 PNG: 2520 JPG: 39288 GIF: 340
# 2013/6 PNG: 1807 JPG: 34519 GIF: 333
Total: 108,964*
Topics: gonewild, images, nsfw, reddit, ohhdemgirls
Source: torrent:urn:sha1:4011d4dd824ac3209d87956235db60c6cfa9e9f4

Community Texts
data
eye 4,568
favorite 11
comment 0

http://amabitch.com/ galleries as of May 10th, 2015
Topics: amabitch, porn